Empirical p-value for seeing a domain a number of times
What is the chance of randomly seeing any domain the observed number of times among all proteins that interact with a specific viral protein?
printTable(res_count, doman_viral_pairs = doman_viral_pairs, motifs = motifs)
writeTable(res_count, "./results/domains_empirical_p_value.tsv")
plot(res_count)
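The question above can be illustrated with a small permutation sketch. This is not the code that produced `res_count`. The toy protein IDs, the `empirical_p` helper, the "at least as many as observed" tail and the number of permutations are all assumptions made for illustration.

```r
# Toy data (hypothetical): 200 background proteins, 20 of which carry the domain.
set.seed(1)
background <- paste0("P", 1:200)
has_domain <- background %in% paste0("P", 1:20)
names(has_domain) <- background

# Hypothetical interaction partners of one viral protein: 5 of 15 carry the domain.
interactors <- c(paste0("P", 1:5), paste0("P", 101:110))

# Hypothetical helper: how often does a random protein set of the same size
# contain the domain at least as many times as observed?
empirical_p <- function(interactors, has_domain, n_perm = 1000) {
  observed <- sum(has_domain[interactors])
  null_counts <- replicate(n_perm, {
    random_set <- sample(names(has_domain), length(interactors))
    sum(has_domain[random_set])
  })
  # Adding 1 to numerator and denominator keeps the estimate above 0.
  (sum(null_counts >= observed) + 1) / (n_perm + 1)
}

p_emp <- empirical_p(interactors, has_domain)
p_emp
```

Sampling from the whole background ignores protein degree. A degree-preserving shuffle of the network would be a stricter null, at the cost of more code.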

Fisher test for co-occurrence of binding a viral protein and containing a domain
printTable(resJustFISHER, doman_viral_pairs = doman_viral_pairs, motifs = motifs)
writeTable(resJustFISHER, "./results/domains_fisher_test.tsv")
plot(resJustFISHER, IDs_interactor_viral + IDs_domain_human ~ p.value,
     xlab = "Fisher's Exact Test pvalue", breaks = seq(-0.01, 1.01, 0.01))
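For a single viral protein and a single domain, the co-occurrence test reduces to a 2x2 contingency table over the background proteins. A minimal sketch with base R's `fisher.test` follows. The toy IDs and the one-sided alternative are assumptions, and `resJustFISHER` may have been computed differently.

```r
# Toy data (hypothetical): the same shape as a protein-by-annotation table.
background  <- paste0("P", 1:200)
binds_viral <- background %in% c(paste0("P", 1:5), paste0("P", 101:110))
has_domain  <- background %in% paste0("P", 1:20)

# 2x2 table with TRUE listed first, so cell [1, 1] is "binds and has the domain".
tab <- table(
  binds  = factor(binds_viral, levels = c(TRUE, FALSE)),
  domain = factor(has_domain,  levels = c(TRUE, FALSE))
)

# One-sided test: is the domain enriched among binders (odds ratio > 1)?
ft <- fisher.test(tab, alternative = "greater")
ft$p.value
```

With the one-sided alternative, the p-value equals the upper tail of the hypergeometric distribution, `phyper(4, 20, 180, 15, lower.tail = FALSE)` for this table.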

Combining the p-value for seeing a domain a number of times and the p-value for co-occurrence of binding a viral protein and containing a domain
Multiplying p-values
printTable(resPmult, doman_viral_pairs = doman_viral_pairs, motifs = motifs)
writeTable(resPmult, "./results/domains_empirical_p_value_X_fisher_test.tsv")
plot(resPmult, IDs_interactor_viral + IDs_domain_human ~ p.value,
     xlab = "Fisher's Exact Test pvalue * \nempirical P value for observing domain in N proteins",
     breaks = seq(-0.01, 1.01, 0.01))
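A product of two p-values works as a ranking score, but it is not itself uniformly distributed under the null, so it should not be read as a p-value. The sketch below computes the raw product and, for comparison, Fisher's combined probability method, which is a swapped-in standard technique and not what `resPmult` contains. The values are made up, and Fisher's method assumes the two tests are independent, which may not hold here.

```r
# Made-up p-values for three viral protein / domain pairs.
fisher_p    <- c(0.01, 0.20, 1.00)
empirical_p <- c(0.02, 0.50, 1.00)

# Raw product: fine for ranking, not calibrated as a p-value.
product_score <- fisher_p * empirical_p

# Fisher's method: -2 * sum(log p) follows a chi-squared distribution
# with 2k degrees of freedom (k = 2 tests here) if the tests are independent.
chi_sq     <- -2 * (log(fisher_p) + log(empirical_p))
combined_p <- pchisq(chi_sq, df = 4, lower.tail = FALSE)

data.frame(fisher_p, empirical_p, product_score, combined_p)
```

Both scores rank the three pairs in the same order. Only `combined_p` stays interpretable on the p-value scale.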

Multiplying the inverse of p-values
The idea is that lower p-values indicate a stronger signal. I am not sure this is statistically sound, but it removes domains with p = 1.0: the inverse of an empirical p-value of 1.0 (for the domain frequency) is 0, so the Fisher term is multiplied by 0.
printTable(resPmultInv, doman_viral_pairs = doman_viral_pairs, motifs = motifs)
writeTable(resPmultInv, "./results/domains_inv_empirical_p_value_X_inv_fisher_test.tsv")
plot(resPmultInv, IDs_interactor_viral + IDs_domain_human ~ p.value,
     xlab = "Inverse of Fisher's Exact Test pvalue * \ninverse of empirical P value for observing domain in N proteins",
     breaks = seq(-0.01, 1.01, 0.01))
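The sketch below reads "inverse" as the complement `1 - p`. This is an assumption, chosen because it is the reading under which p = 1.0 maps to 0 and the score stays between 0 and 1. The values are made up, and the result is a score where higher means stronger, not a p-value.

```r
# Made-up p-values; the third pair has an empirical p-value of exactly 1.
fisher_p    <- c(0.01, 0.20, 0.03)
empirical_p <- c(0.02, 0.50, 1.00)

# Complement product (assumed meaning of "inverse"): any p = 1 zeroes the score.
inv_score <- (1 - fisher_p) * (1 - empirical_p)

# Pairs with a score of 0 can then be dropped.
keep <- inv_score > 0
data.frame(fisher_p, empirical_p, inv_score, keep)
```

The third pair is dropped despite a Fisher p-value of 0.03, which shows how strongly this score penalises a single uninformative test.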

2-step filtering, ranking by Fisher test p-value
printTable(sequential_filter, doman_viral_pairs = doman_viral_pairs, motifs = motifs)
plot(sequential_filter, IDs_interactor_viral + IDs_domain_human ~ p.value,
     xlab = "Fisher's Exact Test pvalue", breaks = seq(-0.01, 1.01, 0.01))
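A 2-step filter of this kind can be sketched as: keep pairs that pass a cutoff on one p-value, then order the survivors by the other. The column names `Emp.p.value` and `p.value` follow the ones used in this section. The toy rows, the 0.05 cutoff and the choice to filter on the empirical p-value first are assumptions, not necessarily how `sequential_filter` was built.

```r
# Toy results table (hypothetical values).
pairs <- data.frame(
  IDs_interactor_viral = c("V1", "V1", "V2", "V2"),
  IDs_domain_human     = c("D1", "D2", "D1", "D3"),
  Emp.p.value          = c(0.01, 0.60, 0.03, 0.04),
  p.value              = c(0.20, 0.001, 0.005, 0.05),
  stringsAsFactors     = FALSE
)

# Step 1: keep pairs whose domain count is unlikely by chance (assumed cutoff).
emp_cutoff <- 0.05
passed <- pairs[pairs$Emp.p.value <= emp_cutoff, ]

# Step 2: rank the remaining pairs by Fisher test p-value, smallest first.
ranked <- passed[order(passed$p.value), ]
ranked
```

Unlike the two product scores, this keeps both p-values separate, so the V1-D2 pair is removed despite having the smallest Fisher p-value.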

PermutResult2D(res = sequential_filter, N = 500, value.cols = c("p.value", "Emp.p.value")) +
ggtitle("2D-bin plots of 250 top-scoring viral protein - human domain pairs, \n statistic: count of a domain among interacting partners of a viral protein")
## Warning: Removed 242 rows containing non-finite values (stat_density).
## Warning in (function (data, mapping, alignPercent = 0.6, method =
## "pearson", : Removed 242 rows containing missing values
## Warning: Transformation introduced infinite values in continuous y-axis
## Warning: Removed 278 rows containing non-finite values (stat_bin2d).
## Warning: Removed 242 rows containing non-finite values (stat_density).

writeTable(sequential_filter, "./results/domains_sequential_filter.tsv")